Class Project: Analysis of the S&P 500 to show historical trends based on past data¶
David Meduga S&DS 123
Introduction¶
"The S&P500 is a market-capitalization-weighted index of 500 leading publicly traded companies in the U.S. The index actually has 503 components because three of them have two share classes listed." (Investopedia) The S&P500 is a leading indicator in the financial world that signals how the economy is doing and reflects the financial well-being of 500 of the largest U.S. companies. The objective of this project is to analyze the S&P500 and, based on this data, show investing strategies that indicate when it was most profitable to make a trade, either buying long or selling short. In my final project, I compute the adjusted returns of the S&P500, take the growth rate of specific stocks, use nine of the largest stocks, which make up a large portion of the S&P500, to show their growth over time, and apply a basic investment strategy to show when it was optimal to buy or sell an individual stock in the S&P500.
The project focuses on the financial world because it touches every part of the economy, and a shock in the stock market is usually tied to a world event that causes a serious economic problem. Being able to track the stock market and see changes in real time means staying ahead of the game, and this information is very valuable: it leads to profits, real-world impacts, and innovation. This is why I analyze the S&P500 to see historical trends and to make trading decisions based on previous data.
There have been many other analyses of the S&P500, from firms such as J.P. Morgan, Citadel, and DRW Trading Group. Each firm has its own way of analyzing the data and reaches different conclusions about the most profitable trading strategies. They use different techniques that yield different amounts of profit, and they have a wide variety of teams dedicated to researching the next strategy.
I obtained the price data with the Python package yfinance, which pulls from Yahoo Finance. I used the page https://en.wikipedia.org/wiki/List_of_S%26P_500_companies to get the list of S&P500 companies and their ticker symbols.
Data Wrangling¶
I use the following names: adj_close = adjusted close of each stock, df = DataFrame of the data downloaded with yfinance, fig = interactive plot, top9stocks = nine of the largest stocks in the S&P500, and rets = return rate. All other functions are explained as needed in comments above the code.
Data installation¶
#run the pip install if you haven't already installed the package
#pip install yfinance
import pandas as pd
import yfinance as yf
import seaborn as sns
import matplotlib.pyplot as plt
import numpy as np
import plotly.express as px
import plotly.graph_objects as go
from datetime import datetime
%matplotlib inline
S&P500 Companies¶
#The Wikipedia URL of the S&P 500
sp_wiki_url = "https://en.wikipedia.org/wiki/List_of_S%26P_500_companies"
# Reading the HTML table into a list of DataFrames
sp_wiki_df_list = pd.read_html(sp_wiki_url)
# Only getting the information needed containing the company information
sp_df = sp_wiki_df_list[0]
# Extract the ticker symbol columns
sp_ticker_list = list(sp_df['Symbol'].values)
#now downloading each ticker's data from Yahoo Finance over a 24-year period
df = yf.download(sp_ticker_list, start='2000-01-01', end='2024-01-01')
[*********************100%%**********************] 503 of 503 completed
5 Failed downloads:
['GEV', 'SOLV']: Exception("%ticker%: Data doesn't exist for startDate = 946702800, endDate = 1704085200")
['BRK.B']: Exception('%ticker%: No timezone found, symbol may be delisted')
['MSI']: JSONDecodeError('Expecting value: line 1 column 1 (char 0)')
['BF.B']: Exception('%ticker%: No price data found, symbol may be delisted (1d 2000-01-01 -> 2024-01-01)')
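Two of the failed downloads above, BRK.B and BF.B, are share-class tickers. A likely cause, stated here as an assumption and not verified in this notebook, is that the Wikipedia table writes share classes with a dot while Yahoo Finance expects a hyphen (BRK-B, BF-B). The helper below is a hypothetical sketch (normalize_tickers is not part of yfinance) that rewrites the symbols before they are passed to yf.download. It would not help with GEV and SOLV, whose error message says no data exists in the requested date range.

```python
def normalize_tickers(tickers):
    """Replace the dot in share-class tickers with a hyphen.

    Assumption: Yahoo Finance uses 'BRK-B' where Wikipedia lists 'BRK.B'.
    """
    return [ticker.strip().replace(".", "-") for ticker in tickers]


# Made-up sample in the Wikipedia style
wiki_style = ["AAPL", "BRK.B", "BF.B", "MSFT"]
yahoo_style = normalize_tickers(wiki_style)
print(yahoo_style)
```

The normalized list could then replace sp_ticker_list in the download call.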
#Getting the Adjusted close for each Stock
adj_close = df['Adj Close']
#listing the columns (one per ticker)
adj_close.columns
Index(['A', 'AAL', 'AAPL', 'ABBV', 'ABNB', 'ABT', 'ACGL', 'ACN', 'ADBE', 'ADI',
...
'WTW', 'WY', 'WYNN', 'XEL', 'XOM', 'XYL', 'YUM', 'ZBH', 'ZBRA', 'ZTS'],
dtype='object', name='Ticker', length=503)
adj_close.head()
| Date | A | AAL | AAPL | ABBV | ABNB | ABT | ACGL | ACN | ADBE | ADI | ... | WTW | WY | WYNN | XEL | XOM | XYL | YUM | ZBH | ZBRA | ZTS |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2000-01-03 | 43.613018 | NaN | 0.846127 | NaN | NaN | 8.992852 | 1.277778 | NaN | 16.274670 | 28.438288 | ... | NaN | 11.505338 | NaN | 6.977998 | 18.328693 | NaN | 4.680299 | NaN | 25.027779 | NaN |
| 2000-01-04 | 40.281456 | NaN | 0.774790 | NaN | NaN | 8.735913 | 1.270833 | NaN | 14.909397 | 26.999628 | ... | NaN | 11.073110 | NaN | 7.138671 | 17.977619 | NaN | 4.586221 | NaN | 24.666668 | NaN |
| 2000-01-05 | 37.782803 | NaN | 0.786128 | NaN | NaN | 8.719850 | 1.388889 | NaN | 15.204174 | 27.393782 | ... | NaN | 11.659700 | NaN | 7.414122 | 18.957693 | NaN | 4.609741 | NaN | 25.138889 | NaN |
| 2000-01-06 | 36.344170 | NaN | 0.718097 | NaN | NaN | 9.024966 | 1.375000 | NaN | 15.328291 | 26.644888 | ... | NaN | 12.205120 | NaN | 7.345257 | 19.937765 | NaN | 4.570543 | NaN | 23.777779 | NaN |
| 2000-01-07 | 39.372864 | NaN | 0.752113 | NaN | NaN | 9.121319 | 1.451389 | NaN | 16.072981 | 27.393782 | ... | NaN | 11.803779 | NaN | 7.345257 | 19.879251 | NaN | 4.468629 | NaN | 23.513889 | NaN |
5 rows × 503 columns
#pulling the S&P500 index (^GSPC) from Yahoo Finance and showing it in an interactive plot
sp_ticker_list = ['^GSPC']
df = yf.download(sp_ticker_list, start='2000-01-01', end='2024-01-01')
#An interactive plot using Plotly
fig = px.line(df, x=df.index, y='Adj Close', title='S&P500 Adjusted Close Price')
fig.update_layout(
xaxis_title='Date',
yaxis_title='Price',
height=800,
width=1000
)
fig.show()
[*********************100%%**********************] 1 of 1 completed
#pulling the S&P500 data from 1980 onward to show the growth of the index over time
df = yf.download("^GSPC", start = '1980-01-01')
[*********************100%%**********************] 1 of 1 completed
#this is an interactive candlestick graph which shows the open, close, high, and low of the S&P500
fig = go.Figure(data = [go.Candlestick(
x = df.index,
open = df['Open'],
close = df['Close'],
high = df['High'],
low =df['Low']
)
])
fig.update_layout(title='Candlestick of the S&P500')
fig.show()
#enlarging the interactive candlestick chart to make it easier to see when the index was going up and down
fig.update_layout(
title='Interactive Candlestick Chart',
width=1200, # Set the width of the chart
height=800, # Set the height of the chart
xaxis_rangeslider_visible=False, # Hide the range slider
showlegend=False # Hide the legend
)
Real World Impacts¶
#Historical financial crashes (label -> date)
important_dates = {
    'The Oil Crisis': '1982-04-29',
    'The Tech Bubble': '2000-09-11',
    'Financial Crisis and Great Recession': '2007-10-12',
    'Covid 19 Pandemic': '2020-03-20',
}
#reusing the interactive graph from earlier and drawing a vertical line at each financial crash
fig = go.Figure(data = [go.Candlestick(
x = df.index,
open = df['Open'],
close = df['Close'],
high = df['High'],
low =df['Low'],
)])
fig.update_layout(
title= "S&P500 Shocks",
yaxis_title= "S&P500 Stock",
shapes = [
dict(
x0 = '1982-04-29', x1= '1982-04-29', y0=0, y1=1, xref = 'x',
yref= 'paper', line_width = 2),
dict(
x0 = '2000-09-11', x1= '2000-09-11', y0=0, y1=1, xref = 'x',
yref= 'paper', line_width = 2),
dict(
x0 = '2007-10-12', x1= '2007-10-12', y0=0, y1=1, xref = 'x',
yref= 'paper', line_width = 2),
dict(
x0 = '2020-03-20', x1= '2020-03-20', y0=0, y1=1, xref = 'x',
yref= 'paper', line_width = 2)
],
annotations = [
dict(x= '1982-04-29', y = 1.1 , xref ='x', yref = 'paper', showarrow = False,
xanchor = 'left', text = 'The Oil Crisis'),
dict(x= '2000-09-11', y = 1.1, xref ='x', yref = 'paper', showarrow = False,
xanchor = 'left', text = 'The Tech Bubble'),
dict(x= '2007-10-12', y = 1.1, xref ='x', yref = 'paper', showarrow = False,
xanchor = 'left', text = 'Financial Crisis and Great Recession'),
dict(x= '2020-03-20', y = 1.1, xref ='x', yref = 'paper', showarrow = False,
xanchor = 'left', text = 'Covid 19 Pandemic')
]
)
fig.update_layout(xaxis_rangeslider_visible= False)
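The four vertical lines and labels above are written out by hand. As a minimal sketch, the same shapes and annotations could be generated in a loop from a label-to-date mapping. The names crash_dates and vertical_markers are hypothetical and introduced only for this illustration; the dictionary keys match the layout arguments used in the cell above.

```python
# Hypothetical mapping of event label -> date, using the same four events
crash_dates = {
    "The Oil Crisis": "1982-04-29",
    "The Tech Bubble": "2000-09-11",
    "Financial Crisis and Great Recession": "2007-10-12",
    "Covid 19 Pandemic": "2020-03-20",
}


def vertical_markers(events):
    """Build Plotly-style shape and annotation dicts, one pair per event."""
    shapes, annotations = [], []
    for label, date in events.items():
        # Vertical line spanning the full plot height at the event date
        shapes.append(dict(x0=date, x1=date, y0=0, y1=1,
                           xref="x", yref="paper", line_width=2))
        # Text label placed just above the plot area
        annotations.append(dict(x=date, y=1.1, xref="x", yref="paper",
                                showarrow=False, xanchor="left", text=label))
    return shapes, annotations


shapes, annotations = vertical_markers(crash_dates)
print(len(shapes), len(annotations))
```

The two lists could then be passed as fig.update_layout(shapes=shapes, annotations=annotations), so adding another event only requires one more dictionary entry.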
Now let's view the top stocks that make up a large portion of the S&P500¶
top9stocks = dict(
AAPL = "Apple stock",
AMZN = 'Amazon Stock',
NVDA = 'NVIDIA',
GOOGL = 'Alphabet Class A',
TSLA = 'Tesla',
GOOG = 'Alphabet Class C',
META = 'Meta Platforms Class A',
MSFT = 'Microsoft Stock',
UNH = 'United Health Group'
)
list(top9stocks.keys())
['AAPL', 'AMZN', 'NVDA', 'GOOGL', 'TSLA', 'GOOG', 'META', 'MSFT', 'UNH']
df_top9 = yf.download(list(top9stocks.keys()))
[*********************100%%**********************] 9 of 9 completed
adj_close_top9 = df_top9['Adj Close']
adj_close_top9 = adj_close_top9.dropna()
Top 9 Stocks Adjusted Close Over Time¶
adj_close_top9.dropna().plot(subplots = True, figsize = (12,10));
#A readable list of all the stocks and their respective names
for key, value in top9stocks.items():
    print(f"{key:8s} | {value}")
AAPL     | Apple stock
AMZN     | Amazon Stock
NVDA     | NVIDIA
GOOGL    | Alphabet Class A
TSLA     | Tesla
GOOG     | Alphabet Class C
META     | Meta Platforms Class A
MSFT     | Microsoft Stock
UNH      | United Health Group
#the dtype of the stocks
adj_close_top9.info()
<class 'pandas.core.frame.DataFrame'>
DatetimeIndex: 2999 entries, 2012-05-18 to 2024-04-19
Data columns (total 9 columns):
 #   Column  Non-Null Count  Dtype
---  ------  --------------  -----
 0   AAPL    2999 non-null   float64
 1   AMZN    2999 non-null   float64
 2   GOOG    2999 non-null   float64
 3   GOOGL   2999 non-null   float64
 4   META    2999 non-null   float64
 5   MSFT    2999 non-null   float64
 6   NVDA    2999 non-null   float64
 7   TSLA    2999 non-null   float64
 8   UNH     2999 non-null   float64
dtypes: float64(9)
memory usage: 234.3 KB
#rounding to the second decimal place
adj_close_top9.describe().round(2)
| Ticker | AAPL | AMZN | GOOG | GOOGL | META | MSFT | NVDA | TSLA | UNH |
|---|---|---|---|---|---|---|---|---|---|
| count | 2999.00 | 2999.00 | 2999.00 | 2999.00 | 2999.00 | 2999.00 | 2999.00 | 2999.00 | 2999.00 |
| mean | 70.38 | 76.99 | 64.34 | 64.41 | 165.14 | 135.02 | 103.73 | 84.41 | 236.54 |
| std | 58.51 | 55.04 | 40.36 | 39.77 | 101.73 | 111.24 | 155.68 | 105.37 | 159.58 |
| min | 11.98 | 10.41 | 13.92 | 13.99 | 17.71 | 21.51 | 2.61 | 1.74 | 42.57 |
| 25% | 24.13 | 21.34 | 29.21 | 29.70 | 81.98 | 40.37 | 5.26 | 14.35 | 100.27 |
| 50% | 41.15 | 77.25 | 53.48 | 53.77 | 159.09 | 90.07 | 42.74 | 20.17 | 211.89 |
| 75% | 129.02 | 122.62 | 95.52 | 95.16 | 212.25 | 233.31 | 137.71 | 180.94 | 379.28 |
| max | 197.86 | 189.05 | 160.79 | 159.41 | 527.34 | 429.37 | 950.02 | 409.97 | 548.93 |
adj_close_top9.mean()
Ticker
AAPL      70.383720
AMZN      76.986150
GOOG      64.340510
GOOGL     64.413177
META     165.136826
MSFT     135.015776
NVDA     103.728866
TSLA      84.414281
UNH      236.542746
dtype: float64
#putting all the data gathered together
adj_close_top9.aggregate(['min', 'mean', 'std', 'median', 'max']).round(2)
| Ticker | AAPL | AMZN | GOOG | GOOGL | META | MSFT | NVDA | TSLA | UNH |
|---|---|---|---|---|---|---|---|---|---|
| min | 11.98 | 10.41 | 13.92 | 13.99 | 17.71 | 21.51 | 2.61 | 1.74 | 42.57 |
| mean | 70.38 | 76.99 | 64.34 | 64.41 | 165.14 | 135.02 | 103.73 | 84.41 | 236.54 |
| std | 58.51 | 55.04 | 40.36 | 39.77 | 101.73 | 111.24 | 155.68 | 105.37 | 159.58 |
| median | 41.15 | 77.25 | 53.48 | 53.77 | 159.09 | 90.07 | 42.74 | 20.17 | 211.89 |
| max | 197.86 | 189.05 | 160.79 | 159.41 | 527.34 | 429.37 | 950.02 | 409.97 | 548.93 |
adj_close_top9.diff().head()
| Date | AAPL | AMZN | GOOG | GOOGL | META | MSFT | NVDA | TSLA | UNH |
|---|---|---|---|---|---|---|---|---|---|
| 2012-05-18 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| 2012-05-21 | 0.934280 | 0.2130 | 0.341470 | 0.343093 | -4.195549 | 0.385847 | 0.048159 | 0.080667 | 1.297382 |
| 2012-05-22 | -0.130314 | -0.1390 | -0.331507 | -0.333083 | -3.026787 | 0.008038 | -0.034400 | 0.135333 | 0.141380 |
| 2012-05-23 | 0.410906 | 0.0975 | 0.215691 | 0.216717 | 0.998940 | -0.522503 | 0.068798 | 0.014667 | -0.299397 |
| 2012-05-24 | -0.158438 | -0.1020 | -0.144458 | -0.145144 | 1.028908 | -0.032152 | -0.075677 | -0.049333 | 0.715218 |
adj_close_top9.pct_change().round(3).head()
| Date | AAPL | AMZN | GOOG | GOOGL | META | MSFT | NVDA | TSLA | UNH |
|---|---|---|---|---|---|---|---|---|---|
| 2012-05-18 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| 2012-05-21 | 0.058 | 0.020 | 0.023 | 0.023 | -0.110 | 0.016 | 0.017 | 0.044 | 0.029 |
| 2012-05-22 | -0.008 | -0.013 | -0.022 | -0.022 | -0.089 | 0.000 | -0.012 | 0.071 | 0.003 |
| 2012-05-23 | 0.024 | 0.009 | 0.014 | 0.014 | 0.032 | -0.022 | 0.025 | 0.007 | -0.006 |
| 2012-05-24 | -0.009 | -0.009 | -0.010 | -0.010 | 0.032 | -0.001 | -0.027 | -0.024 | 0.016 |
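The two tables above come from diff() and pct_change(). As a small illustration on made-up prices (not the downloaded data), diff() gives the absolute change from the previous row, and pct_change() gives that change divided by the previous row's value. The first row is NaN in both because it has no previous row.

```python
import pandas as pd

# Made-up prices for illustration only
prices = pd.Series([100.0, 110.0, 99.0])

abs_change = prices.diff()        # price[t] - price[t-1]
pct = prices.pct_change()         # (price[t] - price[t-1]) / price[t-1]

# The same percentage change computed by hand from diff() and shift()
manual = prices.diff() / prices.shift(1)

print(pd.DataFrame({"diff": abs_change, "pct_change": pct, "manual": manual}))
```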
Bar Graph of the Percentage Change¶
adj_close_top9.pct_change().mean().plot(kind = "bar", figsize = (5,5));
#shifting the data by 1 row down
adj_close_top9.shift(1)
| Date | AAPL | AMZN | GOOG | GOOGL | META | MSFT | NVDA | TSLA | UNH |
|---|---|---|---|---|---|---|---|---|---|
| 2012-05-18 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| 2012-05-21 | 16.036409 | 10.692500 | 14.953949 | 15.025025 | 38.189480 | 23.528458 | 2.770252 | 1.837333 | 44.901154 |
| 2012-05-22 | 16.970690 | 10.905500 | 15.295419 | 15.368118 | 33.993931 | 23.914305 | 2.818411 | 1.918000 | 46.198536 |
| 2012-05-23 | 16.840376 | 10.766500 | 14.963912 | 15.035035 | 30.967144 | 23.922342 | 2.784012 | 2.053333 | 46.339916 |
| 2012-05-24 | 17.251282 | 10.864000 | 15.179603 | 15.251752 | 31.966084 | 23.399839 | 2.852809 | 2.068000 | 46.040520 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 2024-04-15 | 176.550003 | 186.130005 | 159.190002 | 157.729996 | 511.899994 | 421.899994 | 881.859985 | 171.050003 | 439.200012 |
| 2024-04-16 | 172.690002 | 183.619995 | 156.330002 | 154.860001 | 500.230011 | 413.640015 | 860.010010 | 161.479996 | 445.630005 |
| 2024-04-17 | 169.380005 | 183.320007 | 156.000000 | 154.399994 | 499.760010 | 414.579987 | 874.150024 | 157.110001 | 468.890015 |
| 2024-04-18 | 168.000000 | 181.279999 | 156.880005 | 155.470001 | 494.170013 | 411.839996 | 840.349976 | 155.449997 | 478.989990 |
| 2024-04-19 | 167.039993 | 179.220001 | 157.460007 | 156.009995 | 501.799988 | 404.269989 | 846.710022 | 149.929993 | 493.179993 |
2999 rows × 9 columns
#Getting the log return of each stock
rets = np.log(adj_close_top9 / adj_close_top9.shift(1))
Return Rate Graph for the Top 9 Stocks in the S&P500¶
rets.cumsum().apply(np.exp).plot(figsize = (10,5));
#each graph individually
rets.cumsum().apply(np.exp).plot(subplots = True, figsize = (12,10));
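The plots above rely on a property of log returns: the daily log return is ln(P_t / P_{t-1}), so summing the log returns and exponentiating recovers P_t / P_0, the growth of one unit invested on the first day. The sketch below checks this on made-up prices; the variable names are illustrative and not taken from the notebook.

```python
import numpy as np
import pandas as pd

# Made-up prices for illustration only
prices = pd.Series([50.0, 55.0, 44.0, 66.0])

# Daily log returns; the first value is NaN because there is no previous price
log_rets = np.log(prices / prices.shift(1))

# Cumulative growth of one unit invested at the first price
growth = log_rets.cumsum().apply(np.exp)

# Direct normalization by the first price gives the same path
normalized = prices / prices.iloc[0]

print(pd.DataFrame({"growth": growth, "normalized": normalized}))
```

This is why the cumulative-return lines all start near 1 and can be compared across stocks with very different price levels.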
Resampling the data¶
# Resampling the data to weekly frequency (last observation of each week)
adj_close_top9.resample('1w', label = 'right').last().head()
| Date | AAPL | AMZN | GOOG | GOOGL | META | MSFT | NVDA | TSLA | UNH |
|---|---|---|---|---|---|---|---|---|---|
| 2012-05-20 | 16.036409 | 10.6925 | 14.953949 | 15.025025 | 38.189480 | 23.528458 | 2.770252 | 1.837333 | 44.901154 |
| 2012-05-27 | 17.001230 | 10.6445 | 14.733027 | 14.803053 | 31.876179 | 23.359653 | 2.843637 | 1.987333 | 46.672581 |
| 2012-06-03 | 16.961918 | 10.4110 | 14.221195 | 14.288789 | 27.690619 | 22.869303 | 2.747319 | 1.876667 | 45.774391 |
| 2012-06-10 | 17.546375 | 10.9240 | 14.457061 | 14.525776 | 27.071278 | 23.833916 | 2.779425 | 2.005333 | 48.236088 |
| 2012-06-17 | 17.359219 | 10.9175 | 14.060049 | 14.126877 | 29.978193 | 24.131346 | 2.818411 | 1.994000 | 49.165642 |
# Resampling the data to monthly frequency (first observation of each month)
adj_close_top9.resample('1m', label = 'right').first().head()
| Date | AAPL | AMZN | GOOG | GOOGL | META | MSFT | NVDA | TSLA | UNH |
|---|---|---|---|---|---|---|---|---|---|
| 2012-05-31 | 16.036409 | 10.6925 | 14.953949 | 15.025025 | 38.189480 | 23.528458 | 2.770252 | 1.837333 | 44.901154 |
| 2012-06-30 | 16.961918 | 10.4110 | 14.221195 | 14.288789 | 27.690619 | 22.869303 | 2.747319 | 1.876667 | 45.774391 |
| 2012-07-31 | 17.915249 | 11.4660 | 14.457560 | 14.526276 | 30.737387 | 24.565413 | 3.084428 | 2.026667 | 46.961941 |
| 2012-08-31 | 18.347324 | 11.6045 | 15.757935 | 15.832833 | 20.857868 | 23.640997 | 3.070669 | 1.750000 | 42.746555 |
| 2012-09-30 | 20.495808 | 12.3940 | 16.962421 | 17.043043 | 17.711208 | 24.590599 | 3.045443 | 1.876000 | 45.542900 |
# Resampled graph showing the cumulative return at the end of each month
rets.cumsum().apply(np.exp).resample("1m", label = "right").last().plot(figsize = (10,5));
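The resampled tables in this section differ in which observation they keep: the weekly table uses .last() and the monthly table uses .first(). As a minimal sketch on made-up data (ten business days starting Monday 2024-01-01, chosen only for illustration), the example below shows how the two choices pick different rows from the same weekly bins, and how label='right' stamps each bin with its closing Sunday.

```python
import pandas as pd

# Ten made-up business days: 2024-01-01 (a Monday) through 2024-01-12
dates = pd.bdate_range("2024-01-01", periods=10)
prices = pd.Series(range(1, 11), index=dates, dtype=float)

# Weekly bins end on Sunday; label="right" uses that Sunday as the row label
weekly_last = prices.resample("W", label="right").last()    # Friday values
weekly_first = prices.resample("W", label="right").first()  # Monday values

print(weekly_last)
print(weekly_first)
```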
Opening Prices of the Top 9 Stocks in the S&P500¶
top9_stocks = dict(
AAPL = "Apple stock",
AMZN = 'Amazon Stock',
NVDA = 'NVIDIA',
GOOGL = 'Alphabet Class A',
TSLA = 'Tesla',
GOOG = 'Alphabet Class C',
META = 'Meta Platforms Class A',
MSFT = 'Microsoft Stock',
UNH = 'United Health Group'
)
#downloading the respective data
df1 = yf.download(list(top9_stocks.keys()))
[*********************100%%**********************] 9 of 9 completed
#keeping the opening prices and dropping the NA's
df1= df1['Open'].dropna()
#The most recent data
df1.tail()
| Date | AAPL | AMZN | GOOG | GOOGL | META | MSFT | NVDA | TSLA | UNH |
|---|---|---|---|---|---|---|---|---|---|
| 2024-04-12 | 174.259995 | 187.720001 | 159.404999 | 157.960007 | 517.750000 | 424.049988 | 896.989990 | 172.339996 | 440.339996 |
| 2024-04-15 | 175.360001 | 187.429993 | 160.279999 | 158.860001 | 516.719971 | 426.600006 | 890.979980 | 170.240005 | 442.000000 |
| 2024-04-16 | 171.750000 | 183.270004 | 155.639999 | 154.190002 | 498.109985 | 414.570007 | 864.330017 | 156.740005 | 476.769989 |
| 2024-04-17 | 169.610001 | 184.309998 | 157.190002 | 155.619995 | 503.100006 | 417.250000 | 883.400024 | 157.639999 | 478.600006 |
| 2024-04-18 | 168.029999 | 181.470001 | 156.925003 | 155.339996 | 499.820007 | 410.630005 | 849.700012 | 151.250000 | 486.130005 |
#now let's look at Amazon stock
symbol = 'AMZN'
window = 20
df1['min'] = df1[symbol].rolling(window=window).min()
df1['mean'] = df1[symbol].rolling(window=window).mean()
df1['std'] = df1[symbol].rolling(window=window).std()
df1['median'] = df1[symbol].rolling(window=window).median()
df1['max'] = df1[symbol].rolling(window=window).max()
df1['ewma'] = df1[symbol].ewm(halflife= .5, min_periods = window).mean()
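The ewma column above uses halflife=0.5. As a sketch of what that setting implies, pandas converts a half-life into a smoothing factor alpha = 1 - exp(-ln(2) / halflife), which for 0.5 gives 0.75, so each new observation carries about three quarters of the weight and the average tracks the price very closely. The example below uses made-up prices and omits min_periods for brevity; it also recomputes the last value by hand using the default adjusted weighting.

```python
import numpy as np
import pandas as pd

halflife = 0.5
# Smoothing factor implied by the half-life
alpha = 1 - np.exp(-np.log(2) / halflife)

# Made-up prices for illustration only
prices = pd.Series([10.0, 11.0, 12.0, 13.0])
ewma = prices.ewm(halflife=halflife).mean()

# Manual check of the last value: weights (1 - alpha)**i, newest observation first
weights = (1 - alpha) ** np.arange(len(prices))
manual_last = np.sum(weights * prices.values[::-1]) / weights.sum()

print(alpha, ewma.iloc[-1], manual_last)
```

A longer half-life would smooth more heavily and lag the price more.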
Amazon stock price over the last 200 trading days¶
ax = df1[['min', 'mean', 'max']].iloc[-200:].plot(
figsize = (10,6), style = ['g--', 'r--', 'g--'],
lw=.6
)
df1[symbol].iloc[-200:].plot(ax=ax, lw= 2.0);
Technical Analysis Using Simple Moving Averages¶
#Using rolling statistics: a 42-day (short) and a 252-day (long) simple moving average
df1['SMA1'] = df1[symbol].rolling(window=42).mean()
df1['SMA2'] = df1[symbol].rolling(window=252).mean()
df1[[symbol, 'SMA1', 'SMA2']].tail()
| Date | AMZN | SMA1 | SMA2 |
|---|---|---|---|
| 2024-04-12 | 187.720001 | 176.936191 | 141.374326 |
| 2024-04-15 | 187.429993 | 177.405238 | 141.725437 |
| 2024-04-16 | 183.270004 | 177.740000 | 142.047659 |
| 2024-04-17 | 184.309998 | 178.066905 | 142.369683 |
| 2024-04-18 | 181.470001 | 178.370000 | 142.677302 |
df1.dropna(inplace = True)
#1 means holding long (SMA1 above SMA2) and -1 means holding short
df1['positions'] = np.where(df1['SMA1'] > df1['SMA2'], 1, -1)
Plotting the Simple Moving Averages¶
ax= df1[[symbol, 'SMA1', 'SMA2', 'positions']].plot(figsize= (10,6),
secondary_y = 'positions')
ax.get_legend().set_bbox_to_anchor((0.25, 0.85))
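The crossover rule above can be wrapped in a small function so it can be tried on any price series. This is a minimal sketch under stated assumptions: sma_positions is a hypothetical helper, the prices are synthetic straight lines, and the window lengths are shortened from 42 and 252 so the example stays small. On a steadily rising series the short average sits above the long one, so every position is long; on a steadily falling series every position is short.

```python
import numpy as np
import pandas as pd


def sma_positions(prices, short_window, long_window):
    """Return prices, both moving averages, and a +1/-1 position column."""
    out = pd.DataFrame({"price": prices})
    out["sma_short"] = prices.rolling(window=short_window).mean()
    out["sma_long"] = prices.rolling(window=long_window).mean()
    # Drop the warm-up rows where the long average is not yet defined
    out = out.dropna().copy()
    # +1 = long when the short average is above the long average, else -1 = short
    out["position"] = np.where(out["sma_short"] > out["sma_long"], 1, -1)
    return out


# Synthetic, steadily rising prices from 1 to 100
rising = pd.Series(np.arange(1.0, 101.0))
signals = sma_positions(rising, short_window=5, long_window=20)
print(signals.tail())
```

If the positions were used to compute strategy returns, they would normally be shifted by one day so that each day's return uses the previous day's signal.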
Individual Stock Comparison¶
# Two Stocks competing in the same industry
stock1 = 'AAPL'
stock2 = 'MSFT'
df3 = yf.download([stock1, stock2])
[*********************100%%**********************] 2 of 2 completed
df3.dropna(inplace=True)
#comparing the two stocks individually
df3['Adj Close'].plot(subplots=True, figsize = (10,6));
df3 = df3['Adj Close']
Graph of Apple and Microsoft Stock Rise Over Time¶
# Comparison of the stocks on the same graph
df3.loc['2010':'2019'].plot(secondary_y=stock2, figsize= (12,6));
df_small = df3.loc['2010':'2019']
#gathering the Return Rate of the Stocks
rets1 = np.log(df_small / df_small.shift(1))
rets1.dropna(inplace=True)
#Plotting the return rate
rets1.plot(subplots = True, figsize = (10,6));
# Linear regression of MSFT returns on AAPL returns, drawn over the scatter plot
reg = np.polyfit(rets1[stock1], rets1[stock2], deg =1)
ax = rets1.plot(kind= 'scatter', x = stock1, y=stock2, figsize = (10,6))
ax.plot(rets1[stock1], np.polyval(reg, rets1[stock1]), 'r', lw=2);
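The red line above is a degree-1 least-squares fit. Its slope is tied to the correlation reported in the next section: for a simple linear regression of y on x, the slope equals the correlation multiplied by std(y) / std(x). The sketch below checks this on synthetic returns generated with a fixed seed; the data and the 0.5 coefficient are made up for illustration and are not estimates for AAPL or MSFT.

```python
import numpy as np

# Synthetic daily returns (fixed seed so the example is reproducible)
rng = np.random.default_rng(0)
x = rng.normal(0, 0.02, 500)               # stand-in for stock 1 returns
y = 0.5 * x + rng.normal(0, 0.01, 500)     # stand-in for stock 2 returns

# Least-squares line y = slope * x + intercept
slope, intercept = np.polyfit(x, y, deg=1)

# The same slope recovered from the correlation and the two standard deviations
corr = np.corrcoef(x, y)[0, 1]
slope_from_corr = corr * y.std() / x.std()

print(slope, slope_from_corr, corr)
```

A steep line therefore does not by itself mean a high correlation; it also depends on how volatile each stock is.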
Correlation¶
rets1.corr()
| Ticker | AAPL | MSFT |
|---|---|---|
| AAPL | 1.000000 | 0.458318 |
| MSFT | 0.458318 | 1.000000 |
#rolling 252-day correlation over time, with the full-period correlation from above as a red horizontal line
ax = rets1[stock1].rolling(window=252).corr(rets1[stock2]).plot(figsize=(10,6))
ax.axhline(rets1.corr().iloc[0,1], c = 'r');
import plotly.io as pio
#choosing the renderer so the interactive Plotly figures display in the notebook
pio.renderers.default = 'notebook'
Conclusion¶
In conclusion, the findings of this analysis of the S&P500 include how different stocks are correlated with one another, how infrequently a simple trading strategy actually trades, and how an individual stock tends to move in a general direction. From this analysis, I learned how to use the yfinance package in Python to do a deep dive into a stock, computing its returns and growth rate over time, which gives helpful insight before investing in any stock. The code is versatile: it can uncover a trend among stocks and compare stocks to each other to find their correlation, which helps an investor diversify a portfolio. I also learned that a substantial dip in the S&P500 is usually due to a substantial world disaster or a real-world economic problem. Knowing this, staying informed through news channels and articles will help me better understand the stock market and anticipate how a certain event will influence it. This analysis can help others look at the stocks they are interested in and see the information in an organized way to improve their individual investments. Using Python, we are able to explore all this data and discover models to optimize our investments.
Reflection¶
This was a fun assignment because I was able to see trends I never thought existed and to see how stocks change over time. Putting it together showed me how useful Python is in the real world and how different professions do this kind of analysis for a living. As someone looking to invest, I enjoyed using data to find the best time to buy or sell an investment. It was difficult to plot the data and to use the yfinance package effectively, but it was a good experience in working with data.
Changes for Final Version¶
The changes I want to make for the final version include:
- Adding a search feature where you enter a stock and it returns all of that stock's data from that one prompt
- Adding a simulator where you can see how much money you would have if you had invested a certain amount at a certain time in a stock
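The simulator idea can be sketched in a few lines. This is only an illustration of the arithmetic, not the planned implementation: simulate_investment is a hypothetical function, the prices are made up, and it assumes fractional shares with no fees or taxes. The amount buys amount / buy_price shares, and the final value is that number of shares times the sell price.

```python
def simulate_investment(prices, amount, buy_date, sell_date):
    """Value at sell_date of `amount` invested at buy_date.

    prices: dict mapping a date string to a price (hypothetical input format).
    Assumes fractional shares and ignores fees and taxes.
    """
    shares = amount / prices[buy_date]
    return shares * prices[sell_date]


# Made-up prices for illustration only
example_prices = {"2020-01-02": 50.0, "2021-01-04": 75.0, "2022-01-03": 60.0}

final_value = simulate_investment(example_prices, 1000.0, "2020-01-02", "2021-01-04")
print(final_value)
```

With real data, the dictionary could be replaced by an adjusted-close series indexed by date.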